PyDigger - unearthing stuff about Python

Found 31 out of 334,203. Showing 20 on page 1. Total pages: 2.

Name	Version	Summary	Date
mawo-razdel	1.0.2	Advanced tokenization for Russian with SynTagRus patterns and +25% accuracy	2025-11-01 08:58:42
durak-nlp	0.2.3	Durak: modular Turkish NLP preprocessing toolkit.	2025-10-30 01:39:43
maze-dataset	1.4.1	generating and working with datasets of mazes	2025-10-17 12:31:39
acoustic-solo-dadaGP	0.1.0	Guitar tablature tokenization and processing toolkit.	2025-10-10 20:21:49
bidnlp	0.1.4	A Comprehensive Persian (Farsi) Natural Language Processing Library	2025-10-09 09:23:56
aynlp	0.1.3	AYNLP: A lightweight NLP toolkit built by Ankit and Yash for tokenization, stemming, lemmatization, and more. Visit https://github.com/aijadugar/AYNLP to explore the project.	2025-10-07 06:43:56
tamil-utils	0.4.0	Tiny Tamil text utilities: normalize, tokenize, stopwords, graphemes, n-grams, syllables, Tamil collation; dataset preprocessor; optional spaCy tokenizer hook.	2025-09-17 14:24:34
wizardspell	1.0.0	Dictionary-based spell checking with Unicode-aware tokenization and light normalization. Supports 62 languages via compressed Marisa-Trie dictionaries and returns a compact report of misspellings.	2025-08-28 15:46:47
nupunkt-rs	0.1.1	High-performance Rust implementation of nupunkt sentence/paragraph tokenization	2025-08-16 02:05:56
tokker	0.3.9	Tokker: a fast local-first CLI tokenizer with all the best models in one place	2025-08-09 17:59:33
chunkipy	1.0.0.post1	Chunkipy is an easy-to-use library for chunking text based on the size estimator function you provide.	2025-08-08 12:37:03
ultranlp	1.0.6	Ultra-fast, comprehensive NLP preprocessing library with advanced tokenization	2025-08-02 10:21:43
sakurs	0.1.1	Fast, parallel sentence boundary detection using Delta-Stack Monoid algorithm	2025-07-27 15:33:39
llmvision	0.1.1	Visualize how LLMs tokenize text	2025-07-26 07:34:54
miditok	3.0.6.post1	MIDI / symbolic music tokenizers for Deep Learning models.	2025-07-22 12:41:12
rs-bpe	0.1.0	A ridiculously fast Python BPE (Byte Pair Encoder) implementation written in Rust	2025-03-19 05:58:24
nlpashto	0.0.25	Pashto Natural Language Processing Toolkit	2025-02-01 11:17:44
code-tokenize	0.2.1	Fast program tokenization and structural analysis in Python	2025-01-14 09:17:25
rftokenizer	2.3.0	A character-wise tokenizer for morphologically rich languages	2024-12-17 19:05:30
alphacodings	0.2.0	base26 ([A-Z]) and base52 ([A-Za-z]) encodings	2024-12-09 03:04:43
